Optimizations, including tiling, often target a single level of memory or parallelism, such as the cache. These optimizations usually operate level by level, guided by a cost function parameterized by features of that single level. The benefit of optimizations guided by such one-level cost functions decreases as architectures tend towards hierarchies of memory and parallelism. We have identified...
This paper applies unimodular transformations and tiling to improve the data locality of a loop nest. Due to data dependences and reuse patterns, not all loops can or should be tiled. Therefore, the approach proposed in this paper attempts to capture as much data reuse in the cache as possible while tiling as few loops as possible. By using cones to represent the data dependences and vector spaces...
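The effect of tiling on data locality can be illustrated with a minimal sketch. The example below tiles a naive matrix multiplication so that each pass works on small sub-blocks that fit in cache; the tile size parameter and function name are illustrative, not taken from the paper above.

```python
def matmul_tiled(A, B, n, tile=2):
    """Multiply two n x n matrices with a tiled loop nest.

    Tiling restructures the i/j/k nest so that the sub-blocks of
    A, B and C touched at any moment stay cache-resident; `tile`
    is a hypothetical blocking factor chosen to fit the cache.
    """
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # One tile of work: the same small sub-blocks are
                # reused across the inner three loops.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        s = C[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            s += A[i][k] * B[k][j]
                        C[i][j] = s
    return C
```

The transformation changes only the iteration order, not the result, which is what makes it legal whenever the loop's data dependences permit the interchange.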
This paper presents some compilation techniques to compress holes. Holes are the memory locations mapped by useless template cells and are caused by the non-unit alignment stride in a two-level data-processor mapping. In a two-level data-processor mapping, there is a repeated pattern for array elements mapped onto processors. We classify blocks into classes and use a class table to record the attributes...
Data parallel languages like High Performance Fortran demand efficient compile-time and run-time techniques for tasks such as address generation. Array references with arbitrary affine subscripts can make the task of compilers for such languages highly involved. This paper deals with efficient address generation in programs with array references having two types of commonly encountered affine references,...
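The address-generation problem can be made concrete with a small sketch. For an affine reference A(a*i + b) to an array cyclically distributed over P processors, each processor must enumerate the iterations that touch its own elements. The brute-force enumeration below is only the baseline definition of the problem (efficient schemes exploit the periodicity of the access pattern); the function and its parameters are illustrative, not the paper's algorithm.

```python
def local_iterations(a, b, P, p, n_iters):
    """Iterations i in [0, n_iters) whose access A(a*i + b) lands
    on processor p under a cyclic(1) distribution over P processors.

    Element e is owned by processor e mod P, so iteration i is
    local to p exactly when (a*i + b) mod P == p.
    """
    return [i for i in range(n_iters) if (a * i + b) % P == p]
```

Because (a*i + b) mod P repeats with a period dividing P, the local iteration set is itself periodic, which is the structure efficient address-generation techniques exploit.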
The data distribution problem is very complex, because it involves trade-off decisions between minimizing communication and maximizing parallelism. A common approach towards solving this problem is to break the data mapping into two stages: an alignment stage and a distribution stage. The alignment stage attempts to increase parallelism, while the distribution stage attempts to decrease communication...
Highly parallel computers have the memory capacity and potential speed to perform very high-resolution time-dependent calculations. Parallel computers with hundreds of fast processors require highly scalable algorithms to avoid wasting expensive resources. On these machines careful attention must be given to program design to fully exploit scalable algorithms. We have proposed a programming model...
Static Single Assignment (SSA) form has shown its usefulness as a program representation for code optimization techniques in sequential programs. We introduce the Concurrent Static Single Assignment (CSSA) form to represent explicitly parallel programs with interleaving semantics and post-wait synchronization. The parallel construct considered in this paper is cobegin/coend. A new confluence function,...
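The core idea of SSA renaming, on which the CSSA extension builds, can be sketched in a few lines: every redefinition of a variable receives a fresh name, and a confluence (phi) function at a join point selects the version belonging to the predecessor actually executed. The example below simulates this in plain Python; the names and the dictionary-based phi are illustrative only.

```python
def phi(pred, versions):
    """SSA confluence function: pick the version of a variable
    according to which predecessor block reached the join point."""
    return versions[pred]

def example(c):
    # Original code:        SSA form:
    #   x = 1                 x1 = 1
    #   if c: x = 2           if c: x2 = 2
    #   y = x                 x3 = phi(x1, x2); y1 = x3
    x1 = 1
    versions = {"entry": x1}
    if c:
        x2 = 2                  # redefinition gets a new name
        versions["then"] = x2
        pred = "then"
    else:
        pred = "entry"
    x3 = phi(pred, versions)    # x3 = phi(x1, x2)
    return x3
```

In sequential SSA the phi operands are indexed by control-flow predecessors as above; the CSSA form described in the abstract must additionally account for interleavings between parallel threads, which is where its new confluence function comes in.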
Pointer analysis is essential for optimizing and parallelizing compilers. It examines pointer assignment statements and estimates pointer-induced aliases among pointer variables or possible shapes of dynamic recursive data structures. However, previously proposed techniques perform pointer analysis without knowledge of the traversal patterns of the dynamic recursive data structures being constructed....
This paper presents some compiler and program transformation techniques for concurrent multithreaded architectures, in particular the superthreaded architecture [9], which adopts a thread pipelining execution model that allows threads with data dependences and control dependences to be executed in parallel. In this paper, we identify several important program analysis and transformation techniques...
This paper proposes solutions to two important problems with parallel programming environments that were not previously addressed. The first issue is that current compilers are typically black-box tools with which the user has little interaction. Information gathered by the compiler, although potentially very meaningful for the user, is often inaccessible or hard to decipher. Second, compilation and...
The only way for parallelizing compilers to exploit the potential parallelism of loops for which dependence information is statically inadequate is to use run-time loop parallelization techniques. There are two approaches in this field: the inspector-executor method [17] and the speculative DOALL test [13]. The former always incurs heavy preprocessing overhead during the inspector phase and...
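The inspector-executor scheme mentioned above can be sketched briefly. The inspector runs a cheap pass over the loop's (indirect) subscripts to group iterations into wavefronts of mutually independent iterations; the executor then runs each wavefront, whose iterations could execute in parallel. The code below is a minimal illustration for a single indirectly indexed update, not the method of [17].

```python
def inspector(idx, n_iters):
    """Partition iterations into wavefronts.

    Iteration i conflicts only with earlier iterations touching the
    same element A[idx[i]], so it goes one wave after that element's
    last writer.
    """
    last_wave = {}   # element -> wave of its most recent writer
    waves = []
    for i in range(n_iters):
        w = last_wave.get(idx[i], -1) + 1
        last_wave[idx[i]] = w
        if w == len(waves):
            waves.append([])
        waves[w].append(i)
    return waves

def executor(A, idx, waves):
    """Run the loop body A[idx[i]] += 1 wave by wave."""
    for wave in waves:
        for i in wave:   # iterations within a wave are independent
            A[idx[i]] += 1
    return A
```

The preprocessing cost of building the wavefronts is exactly the "heavy inspector overhead" the abstract refers to: it is paid at run time, before any parallel work begins.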
Handling the procedure interface in an HPF compiler is complex due to the many possible combinations of Fortran 90/HPF properties of an actual array argument and its associated dummy argument. This paper describes an algorithm that reduces this complexity by mapping all the combinations of properties to a small set of canonical Internal Representations. These internal representations as well as the...
This paper describes an ongoing effort supported by the ARPA PCRC (Parallel Compiler Runtime Consortium) project. In particular, we discuss the design and implementation of an HPF compilation system based on the PCRC runtime. The approaches to issues such as directive analysis and communication detection are discussed in detail. The discussion includes fragments of code generated by the compiler.
Many large-scale computational applications contain irregular data access patterns related to unstructured problem domains. Examples include finite element methods, computational fluid dynamics, and molecular dynamics codes. Such codes are difficult to parallelize efficiently with current HPF compilers. However, most of these problems exhibit spatial locality. This property is exploited by our approach...
This extended abstract motivates and briefly describes a strategy for computing symbolic constraints on the values of integer variables and using them to simplify the control flow of compiler-generated parallel programs. This strategy has been implemented and evaluated in the context of the Rice dHPF compiler for High Performance Fortran.
We present an efficient global communication optimizer based on array data-flow analysis, which manages the analysis cost by partitioning the data-flow problems into subproblems and solving the subproblems one at a time in a demand-driven manner. In comparison to traditional array data-flow based techniques, our scheme greatly reduces the memory requirement and manages the analysis time more effectively...
In this paper, we consider the problem of generating efficient, portable communication in compilers for parallel languages. We introduce the Ironman abstraction, which separates data transfer from its implementing communication paradigm. This is done by annotating the compiler-generated code with legal ranges for data transfer in the form of calls to the Ironman library. On each target platform, these...